High performance clock domain crossing FIFO

ABSTRACT

The disclosure relates to clock-crossing elements that may be used to transfer data between different clock domains. The embodiments include dual clock first-in first-out (FIFO) buffers that may employ toggle-based protocols to manage the transference of information regarding the state of the FIFO buffer. The toggle-based protocols may include a feedback-based handshake and bit-sliced toggle lines to prevent errors due to differences between the clock signals in the different clock domains.

BACKGROUND

This disclosure relates to methods and systems that perform data transfer between multiple clock domains.

This section is intended to introduce the reader to various aspects of art that may be related to various aspects of the present disclosure, which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it may be understood that these statements are to be read in this light, and not as admissions of prior art.

Certain electrical devices, including many data processing devices (e.g., computers, mobile phones, wearable devices), may include synchronous circuitry. A synchronous circuit is a digital circuit that operates using a clock signal to synchronize the digital elements, such as memory elements, flip-flops, and/or latches. A region of circuitry in the electrical device that operates synchronized to a common clock may be called a clock domain. Many electrical devices may include multiple synchronous circuits, each circuit synchronized to a different clock signal. That is, the electrical device may have multiple clock domains with different clock signals, which may differ in phases or frequencies. Data transfers between synchronous circuits in different clock domains may be implemented using clock-crossing elements, such as first-in first-out (FIFO) buffers. During a transfer from a transmitting clock domain to a receiving clock domain, the clock-crossing element may receive data using the clock of the transmitting clock domain and may provide data using the clock of the receiving clock domain. As the demands for faster data processing increase, clock speeds in synchronous circuitry of electrical devices also increase. As such, improvements in the clock-crossing elements may facilitate the development of faster, more efficient electrical devices.

BRIEF DESCRIPTION OF THE DRAWINGS

Various aspects of this disclosure may be better understood upon reading the following detailed description and upon reference to the drawings in which:

FIG. 1 is a block diagram of a programmable logic device that includes high-performance clock domain crossing elements, in accordance with an embodiment;

FIG. 2 is a block diagram of a data processing system that may use the programmable logic device to provide fast data processing capabilities, in accordance with an embodiment;

FIG. 3 is a diagram of a programmable logic device configured with multiple clock domains and a high-performance clock domain crossing first-in first-out (FIFO) buffer, in accordance with an embodiment;

FIG. 4 is a schematic diagram of a high-performance clock domain crossing FIFO buffer, in accordance with an embodiment;

FIG. 5 is a schematic diagram of a controller for the high-performance clock domain crossing FIFO buffer, in accordance with an embodiment;

FIG. 6 is a logic diagram of the toggle encoding, synchronization, and toggle decoding circuitry in a high-performance clock domain crossing FIFO buffer, in accordance with an embodiment;

FIG. 7 is a flow chart of a method to write data to the high-performance clock domain crossing FIFO buffer, in accordance with an embodiment;

FIG. 8 is a flow chart of a method to read data from the high-performance clock domain crossing FIFO buffer, in accordance with an embodiment;

FIG. 9 is a schematic illustration of a robust reset process that may be used with the high-performance clock domain crossing FIFO buffer, in accordance with an embodiment; and

FIG. 10 is a flow chart of a method to perform robust reset for clock domain crossing circuitry, in accordance with an embodiment.

DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS

One or more specific embodiments will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It may be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it may be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.

When introducing elements of various embodiments of the present disclosure, the articles “a,” “an,” and “the” are intended to mean that there are one or more of the elements. The terms “comprising,” “including,” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements. Additionally, it should be understood that references to “one embodiment” or “an embodiment” of the present disclosure are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features. Furthermore, the phrase A “based on” B is intended to mean that A is at least partially based on B. Moreover, unless expressly stated otherwise, the term “or” is intended to be inclusive (e.g., logical OR) and not exclusive (e.g., logical XOR). In other words, the phrase A “or” B is intended to mean A, B, or both A and B.

The highly flexible nature of programmable logic devices makes them an excellent fit for accelerating many computing tasks. Thus, programmable logic devices are increasingly used as accelerators for machine learning, video processing, voice recognition, image recognition, and many other highly specialized tasks, particularly those that would be too slow or inefficient in software running on a processor. The increase in the size and complexity of systems that may employ programmable logic devices may lead to an increase in the diversity of circuits, function blocks, soft intellectual property blocks (soft IP blocks), hard IP blocks, soft processors, and/or other devices that share a common die or device. As a result, the programmable logic device may have multiple clock domains for the different circuits as well as clock domain crossing elements to perform data transfer between circuits in different domains. As the speed of the data processing functionalities increases, the clock rates in the clock domains increase accordingly, generating a demand for faster, more efficient clock domain crossing elements.

Data transfers between clock domains may be implemented using synchronizers with input registers clocked by the transmitting domain and output registers clocked by the receiving domain. Data transfers between clock domains may also be implemented by dual clock first-in first-out (FIFO) buffers. Dual clock FIFO buffers may be FIFO buffers that receive write data clocked by the transmitting domain and provide read data clocked by the receiving domain. In order to prevent overflows (i.e., writing to a full FIFO buffer) and underflows (i.e., reading from an empty FIFO buffer), FIFO buffers may keep track of the position of read and write data using pointers (i.e., read pointers, write pointers). As the updates of the pointer information also have to cross clock domains, control circuitry may be used to provide safe transfer of updates to the pointers and to prevent glitches from clock collisions, desynchronization, or metastability.

Embodiments of the present application relate to clock domain crossing elements, such as FIFO buffers, that may employ a faster, more robust pointer management circuit. The pointer management circuitry may use a toggle-based protocol to transfer information related to the FIFO buffer. The toggle-based protocol may use a toggle-based encoding to provide information regarding the number of data elements (i.e., words) that were written or read across the domains. The received information may be used to update write pointers and read pointers. The pointer management circuitry may include a feedback-based handshake process to enable acknowledgment of the data transferred using the toggle-based encoding. The pointer management circuitry may be more robust at high clock frequencies, and may be scaled according to the difference between the clock frequencies of the clock domains.

Moreover, in some embodiments of FIFO buffers that include pre-fetch read instructions, the pointer management circuitry discussed herein may allow the design of simplified combinatorial circuitry for the generation of look-ahead logic, as detailed below. As a result, the pointer management circuitry may also allow improved implementation of pre-fetch read instructions, which may reduce latencies in the FIFO and improve performance in high-frequency applications. Robust reset (e.g., initialization) procedures for the dual clock FIFO buffer are also disclosed. While the discussions described herein relate to FIFO buffers implemented in programmable fabric, the methods and systems described herein may be implemented in hardened digital circuitry as well, using the same descriptions and designs described herein.

By way of introduction, FIG. 1 illustrates a block diagram of a system 10 that may employ a programmable logic device 12 that may implement data processing functions in multiple clock domains of the programmable logic device 12 and that may allow data transfers between different clock domains using the FIFO buffers described herein. Using the system 10, a designer may implement a circuit design functionality on an integrated circuit, such as the reconfigurable programmable logic device 12 (e.g., an FPGA).

The designer may implement a circuit design to be programmed onto the programmable logic device 12 using design software 14, such as a version of Intel® Quartus® by Intel Corporation of Santa Clara, Calif. The design software 14 may use a compiler 16 to generate a low-level circuit design defined by bitstream 18, sometimes known as a program object file and/or configuration program, which programs the programmable logic device 12. In the process of compiling the bitstream 18, the design software 14 may assign different clock domain regions of the programmable logic device 12 to operate using a local clock for synchronization. To transfer data between different clock domains, the design software 14 may include, in the bitstream 18, programming instructions for a FIFO buffer (e.g., allocation of memory for the FIFO, configuration of logic that controls the FIFO buffer), as detailed below. Instructions to perform write requests and read requests for the data transfer may also be included in the bitstream 18.

The compiler 16 may, thus, provide machine-readable instructions representative of the circuit design to the programmable logic device 12 in the form of one or more bitstreams 18. The configuration program (e.g., bitstream) 18 may be programmed into the programmable logic device 12 as a configuration program 20. The configuration program 20 may, in some cases, represent an accelerator function to perform machine learning, video processing, voice recognition, image recognition, or other highly specialized task. As discussed above, the configuration program may be distributed across multiple clock domains in the programmable logic device 12 and may include data transfers between different clock domains.

The programmable logic device 12 may be, or may be a component of, a data processing system 50, as shown in FIG. 2. The data processing system 50 may include a host processor 52, memory and/or storage circuitry 54, and a network interface 56. The data processing system 50 may include more or fewer components (e.g., electronic display, user interface structures, application specific integrated circuits (ASICs)). The host processor 52 may include any suitable processor, such as an Intel® Xeon® processor or a reduced-instruction processor (e.g., a reduced instruction set computer (RISC), an Advanced RISC Machine (ARM) processor) that may manage a data processing request for the data processing system 50 (e.g., to perform machine learning, video processing, voice recognition, image recognition, data compression, database search ranking, bioinformatics, network security pattern identification, spatial navigation, or the like).

The memory and/or storage circuitry 54 may include random access memory (RAM), read-only memory (ROM), one or more hard drives, flash memory, or the like. The memory and/or storage circuitry 54 may be considered external memory to the programmable logic device 12 and may hold data to be processed by the data processing system 50. In some cases, the memory and/or storage circuitry 54 may also store configuration programs (bitstreams 18) for programming the programmable logic device 12. The network interface 56 may allow the data processing system 50 to communicate with other electronic devices. The data processing system 50 may include several different packages or may be contained within a single package on a single package substrate.

In one example, the data processing system 50 may be part of a data center that processes a variety of different requests. For instance, the data processing system 50 may receive a data processing request via the network interface 56 to perform machine learning, video processing, voice recognition, image recognition, data compression, database search ranking, bioinformatics, network security pattern identification, spatial navigation, or some other specialized task. The host processor 52 may cause the programmable logic fabric of the programmable logic device 12 to be programmed with a specific accelerator that is related to the requested task. For instance, the host processor 52 may cause the configuration data (e.g., bitstream 18) to be stored on the storage circuitry 54 or cached in a memory of the programmable logic device 12 to be later programmed into the programmable logic fabric of the programmable logic device 12. The configuration data (e.g., bitstream 18) may represent a circuit design for a specific accelerator function relevant to the requested task.

FIG. 3 illustrates a programmable logic device 12. The programmable logic device 12 may include programmable fabric 112. In the illustrated embodiment, the programmable fabric 112 may be arranged in an array of sectors 114. Each sector 114 may include a sector controller and a sector-specific configuration memory and may store configuration data associated with that sector.

The programmable logic device 12 may also have input/output (I/O) circuitry 116. The I/O circuitry 116 may include, among other things, protocol circuitry, transceivers, amplifiers, clock-and-data recovery circuitry, and eye detectors. The I/O circuitry 116 may be configured to access a memory device (e.g., a high bandwidth memory (HBM), a dynamic random-access memory (DRAM) device), or to connect to other electronic devices using a communications protocol, such as an Ethernet protocol, a peripheral component interconnect express (PCIe) protocol, or a universal serial bus (USB) protocol. The programmable fabric 112 may also include a Network on Chip (NoC) 120 and/or hardened interconnect lines 122 that may provide low latency access between sectors 114 and the I/O circuitry 116.

The illustrated programmable logic device 12 may have a first function block 102 in a first clock domain and a second function block 104 in a second clock domain. For example, the first function block 102 may be synchronized by a clock signal 106 whereas the second function block 104 may be synchronized by a different clock signal 108. In general, clock domains may be implemented by means of clock trees in the programmable logic device 12. It should be noted that a clock domain may cover a portion of a sector 114, a single sector 114, multiple sectors 114, or any other region of the programmable logic device 12. In order to perform data transfers between the first function block 102 and the second function block 104, the dual clock FIFO buffer 110 may be used.

FIG. 4 provides a diagram 140 of a dual clock FIFO buffer 110. The dual clock FIFO buffer 110 may have a memory 142 that implements the buffer, and a controller 144 that manages the memory 142. To that end, the controller 144 may have pointers 145 to a read position and a write position in the memory 142. In the FIFO buffer 110, reading data from the memory 142 may cause an increment in the read pointer and writing data to the memory 142 may cause an increment in the write pointer. The buffer may be either empty or full when the read pointer and the write pointer coincide. As the reading and writing processes take place in different clock domains, the controller 144 may have synchronizing circuitry that prevents inconsistencies between the pointers due to timing mismatches or timing collisions. As detailed in FIG. 5, the synchronizing circuitry may implement a toggle-based protocol that transfers updates to the pointers across the domains using toggle encoding and has an acknowledgment handshake using feedback.
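The ambiguity noted above, in which coinciding pointers may indicate either an empty or a full buffer, may be illustrated with a minimal single-clock sketch. The class name `RingBuffer` and the explicit `used` counter are illustrative assumptions and are not taken from the disclosure; the sketch ignores clock domains entirely and only shows why pointer equality alone is not sufficient.

```python
class RingBuffer:
    """Single-clock ring buffer; an illustrative model, not the disclosed circuit."""

    def __init__(self, depth: int):
        self.mem = [None] * depth
        self.rd = 0      # read pointer
        self.wr = 0      # write pointer
        self.used = 0    # occupancy count that disambiguates rd == wr

    def write(self, word) -> None:
        if self.used == len(self.mem):
            raise OverflowError("write to a full FIFO")
        self.mem[self.wr] = word
        self.wr = (self.wr + 1) % len(self.mem)
        self.used += 1

    def read(self):
        if self.used == 0:
            raise IndexError("read from an empty FIFO")
        word = self.mem[self.rd]
        self.rd = (self.rd + 1) % len(self.mem)
        self.used -= 1
        return word


fifo = RingBuffer(depth=2)
empty_state = (fifo.rd == fifo.wr, fifo.used)  # pointers coincide, no words stored
fifo.write("a")
fifo.write("b")
full_state = (fifo.rd == fifo.wr, fifo.used)   # pointers coincide again, buffer full
```

In this sketch the pointers are equal in both states, and only the occupancy count tells the two apart.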

The incoming data 146 may be received by circuitry in a write clock domain 141A and outgoing data 147 may be transmitted to circuitry in a read clock domain 141B. The write clock domain 141A may be synchronized by a write clock signal 148 and the read clock domain 141B may be synchronized by the read clock signal 150. Circuitry in the write clock domain 141A may interact with the controller 144 using signals 152. Similarly, circuitry in the read clock domain 141B may interact with the controller 144 using signals 154. Signals 152 and 154 may be used to perform write requests, status requests, and to obtain information (e.g., buffer full, buffer empty) that may be used to control the data flow from the functional blocks to the FIFO buffer 110.

FIG. 5 illustrates pointer management circuitry 200 in the controller 144. As discussed above, the pointer management circuitry 200 may be responsible for transferring updates to registers that store the read and write pointers between the write clock domain 141A and the read clock domain 141B. As such, the pointer management circuitry 200 may include circuitry in the write clock domain 141A, circuitry in the read clock domain 141B, and clock-crossing synchronizers with feedback. As discussed above, the transfer of the updates may be performed using toggle-encoded information across the clock-crossing synchronizers. The feedback may be used by a clock domain to determine that the counterpart clock domain across the clock-crossing synchronizers updated its registers. In order to prevent data losses, the pointer management circuitry 200 may include a register that preserves pending update transfers, as detailed below.

During the process of sending data from the write clock domain 141A (i.e., writing data to the FIFO buffer 110), logical circuitry that is in the write clock domain 141A may send write requests 152A and may read a FIFO status 152B. A write request may cause increments in a “write pending” (WP) register 202 and in a “write used” (WU) register 204. For example, if N words were written to the FIFO buffer and the FIFO buffer had the capacity to store the N words, WP register 202 and WU register 204 may be incremented by N. The WP register 202 may keep the number of pending updates to the read clock domain and the WU register 204 may keep the number of total used words in the FIFO buffer 110. If the number stored in the WU register 204 is equal to the buffer capacity, the FIFO status 152B may be a signal indicating that FIFO buffer 110 is full. This signal may be used to prevent overflow of the FIFO buffer 110.

The toggle encoder 206 in the write clock domain 141A may encode the number of words that were written, which is stored in the WP register 202. The toggle encoder 206 may include a set of 1-bit toggles in binary encoding. As such, a 1-bit toggle encoder may allow updates of 1 word at a time, a 2-bit toggle encoder may allow updates of up to 3 words, and a 3-bit toggle encoder may allow updates of up to 7 words (e.g., 7 transactions). The encoded information is sent to the synchronizer 208A, which is synchronized to the read clock domain 141B. The toggle-encoded information is sent from the synchronizer 208A to the toggle decoder 210 in the read clock domain 141B. The feedback synchronizer 208B, which is synchronized to the write clock domain 141A, may also read the output of the synchronizer 208A in the read clock domain 141B. As such, the feedback synchronizer 208B may provide an acknowledgement that the read clock domain 141B received the information sent by the write clock domain 141A.
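The relationship between the number of 1-bit toggle lines and the largest update that may be sent at once can be sketched as follows. The function name `max_words_per_update` is a hypothetical helper introduced for illustration only.

```python
def max_words_per_update(toggle_bits: int) -> int:
    """Largest count expressible by `toggle_bits` binary-weighted toggle lines."""
    if toggle_bits < 1:
        raise ValueError("at least one toggle line is required")
    # Zero means "no update", so the largest usable value is 2**bits - 1.
    return (1 << toggle_bits) - 1


limits = {bits: max_words_per_update(bits) for bits in (1, 2, 3)}
```

For 1, 2, and 3 lines this gives 1, 3, and 7, matching the figures stated above.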

The toggle encoder 206 may receive the acknowledgement from the feedback synchronizer 208B and provide an update 211 to the WP register 202. For example, if the feedback synchronizer 208B indicates that the read clock domain 141B acknowledged an update indicating an increment of N words in the FIFO buffer, the WP register 202 may be decremented by N. The toggle decoder 210 may convert the information from the 1-bit toggled lines into a decoded update 213 to the “read used” (RU) register 212. For example, if the sensed update indicated that N words were written, the RU register 212 may be incremented by N. The RU register 212 may indicate the number of available words in the FIFO buffer 110 to circuitry in the read clock domain 141B.

The process of reading data from the FIFO buffer 110 may, similarly, be associated with updates to the registers in the write clock domain 141A. Logical circuitry that is in the read clock domain 141B may send data requests 154A and may read a FIFO status 154B. A data request 154A, when associated with data retrieval, may cause decrements in the RU register 212 and increments in a “read pending” (RP) register 214. For example, if N words were requested from the FIFO buffer and the FIFO buffer had at least N stored words, RU register 212 may be decremented by N and the RP register 214 may be incremented by N. The RP register 214 may keep the number of pending updates to the write clock domain 141A and the RU register 212 may keep the number of total available words in the FIFO buffer 110. If the number stored in the RU register 212 is equal to zero, the FIFO status 154B may be a signal indicating that FIFO buffer 110 is empty. This signal may be used to prevent underflow of the FIFO buffer 110.

The toggle encoder 216 in the read clock domain 141B may encode the number of words that were read, which is stored in the RP register 214. The toggle encoding may be similar to the one described above. The toggle encoder 216 may include a set of 1-bit toggles that perform binary encoding. As such, a 1-bit toggle encoder may allow updates of 1 word at a time, a 2-bit toggle encoder may allow updates of up to 3 words, and a 3-bit toggle encoder may allow updates of up to 7 words. The encoded information is sent to the synchronizer 218A, which is synchronized to the write clock domain 141A. The toggle-encoded information is sent from the synchronizer 218A to the toggle decoder 220 in the write clock domain 141A. The feedback synchronizer 218B, which is synchronized to the read clock domain 141B, may also read the output of the synchronizer 218A in the write clock domain 141A. As such, the feedback synchronizer 218B may provide an acknowledgement that the write clock domain 141A received the information sent by the read clock domain 141B.

As with the toggle encoder 206, the toggle encoder 216 may receive the acknowledgement from the corresponding feedback synchronizer 218B and provide an update 221 to the RP register 214. For example, if the feedback synchronizer 218B indicates that the write clock domain 141A acknowledged an update indicating that N words were read from the FIFO buffer, the RP register 214 may be decremented by N. The toggle decoder 220 may convert the information from the 1-bit toggled lines into a decoded update 222 to the WU register 204. For example, if the toggle decoder 220 indicates that N words were read, the WU register 204 may be decremented by N, thus allowing the circuitry in the write clock domain 141A to write additional words to the FIFO buffer 110.
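The register bookkeeping described with reference to FIG. 5 may be summarized by a purely behavioral sketch. The class `PointerCounters`, its method names, and the `stored` ground-truth counter are assumptions made for illustration; the model has no real clocks, synchronizers, or metastability, and it treats each update as either outstanding or completed.

```python
import random


class PointerCounters:
    """Behavioral model of the WU/WP/RU/RP registers (illustrative only)."""

    def __init__(self, capacity: int, toggle_bits: int):
        self.capacity = capacity
        self.max_update = (1 << toggle_bits) - 1
        self.wu = self.wp = 0   # write clock domain: "write used", "write pending"
        self.ru = self.rp = 0   # read clock domain: "read used", "read pending"
        self.to_read = 0        # write-to-read update currently crossing
        self.to_write = 0       # read-to-write update currently crossing
        self.stored = 0         # true occupancy, kept only for checking

    def write(self, n: int) -> bool:
        if self.wu + n > self.capacity:
            return False        # would overflow according to the write domain
        self.wu += n
        self.wp += n
        self.stored += n
        return True

    def read(self, n: int) -> bool:
        if n > self.ru:
            return False        # would underflow according to the read domain
        self.ru -= n
        self.rp += n
        self.stored -= n
        return True

    def launch_updates(self) -> None:
        """Start an update in each direction if none is outstanding."""
        if self.to_read == 0:
            self.to_read = min(self.wp, self.max_update)
        if self.to_write == 0:
            self.to_write = min(self.rp, self.max_update)

    def complete_updates(self) -> None:
        """Deliver outstanding updates and apply the acknowledgements."""
        self.ru += self.to_read    # decoded update to RU
        self.wp -= self.to_read    # acknowledged, no longer pending
        self.wu -= self.to_write   # decoded update to WU
        self.rp -= self.to_write
        self.to_read = self.to_write = 0

    def consistent(self) -> bool:
        return 0 <= self.ru <= self.stored <= self.wu <= self.capacity


def run_random_traffic(steps: int = 1000, seed: int = 0) -> bool:
    rng = random.Random(seed)
    model = PointerCounters(capacity=16, toggle_bits=2)
    for _ in range(steps):
        action = rng.choice(("write", "read", "launch", "complete"))
        if action == "write":
            model.write(rng.randint(1, 3))
        elif action == "read":
            model.read(rng.randint(1, 3))
        elif action == "launch":
            model.launch_updates()
        else:
            model.complete_updates()
        if not model.consistent():
            return False
    return True
```

In this model the WU count never falls below the true occupancy and the RU count never exceeds it, so a stale count errs toward refusing an access rather than permitting an overflow or underflow.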

As discussed above, the toggle encoders 206 and 216 and the toggle decoders 210 and 220 may implement a toggle-encoded protocol. To that end, the encoders and the decoders may be coupled by 1-bit lines across the synchronizer blocks. Each 1-bit line may be toggled independently to implement a binary code. For example, a 2-bit toggle encoding may be implemented by an encoder with two 1-bit toggles coupled to a decoder via two 1-bit lines. In this example, the toggle encoder may toggle the lowest-order 1-bit line when the input is 1, may toggle the highest-order 1-bit line when the input is 2, and may toggle both when the input is 3. Such a method may prevent failures due to timing mismatches between the clock domains, as the 1-bit lines are independent. Moreover, the feedback mechanism employs bit slicing. That is, each 1-bit line may operate independently to prevent a toggle from occurring before an acknowledgment. The independence between the 1-bit lines may also relax timing constraints, as clock skew between the different 1-bit lines does not affect the data transfer.
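The 2-bit toggle encoding described above may be sketched with two small functions. The names `toggle_encode` and `toggle_decode` are hypothetical, and the line levels are modeled as the bits of an integer, which is an assumption made for illustration rather than a description of the disclosed circuitry.

```python
def toggle_encode(value: int, lines: int, bits: int = 2) -> int:
    """Flip the line for every 1 bit in `value`; returns the new line levels."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the available toggle lines")
    return lines ^ value


def toggle_decode(previous: int, current: int) -> int:
    """Each line that changed level contributes its binary weight."""
    return previous ^ current


start = 0b01                     # arbitrary current line levels
sent = toggle_encode(3, start)   # an update of 3 flips both lines
in_one_step = toggle_decode(start, sent)

# Skewed arrival: the low-order line is observed one cycle before the
# high-order line, so the decoder sees two partial updates.
intermediate = start ^ 0b01
skewed_total = (toggle_decode(start, intermediate)
                + toggle_decode(intermediate, sent))
```

Because each line carries its own weight, the decoded total is the same whether both toggles are observed together or one cycle apart, which is one way to read the skew tolerance noted above.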

The above-described arrangement may facilitate pre-fetch operations during the reading process. Specifically, the use of the toggle encoding described above may limit the number of possible transactions per cycle, which facilitates the design of look-ahead logic to control pre-fetch operations in the FIFO. As discussed above, in a dual clock FIFO controller that employs 2-bit toggle encoding, the maximum number of transactions per cycle is 3, whereas in a dual clock FIFO controller that employs 3-bit toggle encoding, the maximum number of transactions per cycle is 7. As a result of the limited possible changes per transaction, the possibilities for changes in the RU register 212 from the toggle decoder 210 are constrained, allowing a design of look-ahead logic with simpler combinatorial logic. As the control of the FIFO pre-fetch operations may employ look-ahead logic, the use of the toggle encoding described above may facilitate the design of pre-fetch operations, which impacts FIFO latency and overall performance. The simpler look-ahead logic described above may also be used to generate backpressure signals to the circuitry of the write clock domain 141A during the write process.

In some embodiments, the specific encoding (e.g., the number of 1-bit lines) may be determined based on the difference between the clock rates in the clock domains. As such, the compilation process and/or the circuit synthesis process discussed with respect to FIG. 1 may choose a suitable toggle encoding that prevents overflow and/or underflow events.
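One possible way to pick the number of 1-bit lines from the clock rates is sketched below. The sizing rule is an assumption and is not stated in the disclosure: it supposes that the faster domain may produce one word per clock and that one update round trip takes a fixed number of clocks of the slower domain (`round_trip_slow_clocks`, a hypothetical parameter). The function name `min_toggle_bits` is likewise illustrative.

```python
def min_toggle_bits(fast_hz: int, slow_hz: int,
                    round_trip_slow_clocks: int = 6) -> int:
    """Smallest line count whose largest update covers one round trip.

    Assumes one word per fast clock and a round trip lasting
    `round_trip_slow_clocks` clocks of the slower domain.
    """
    if slow_hz <= 0 or fast_hz < slow_hz or round_trip_slow_clocks <= 0:
        raise ValueError("expected fast_hz >= slow_hz > 0 and a positive round trip")
    # Integer ceiling of round_trip * fast / slow: words that may accumulate.
    words = -((-round_trip_slow_clocks * fast_hz) // slow_hz)
    # Need 2**bits - 1 >= words, i.e. 2**bits > words.
    return words.bit_length()


same_rate = min_toggle_bits(100_000_000, 100_000_000)
double_rate = min_toggle_bits(200_000_000, 100_000_000)
```

Under these assumptions, equal clock rates call for 3 lines (up to 7 words per update) and a 2:1 ratio calls for 4 lines; a real design flow may apply different criteria.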

FIG. 6 provides an illustration 300 of the coupling between the toggle encoder 206 and the toggle decoder 210 across the synchronizer 208A and the feedback synchronizer 208B. While the descriptions relate to circuitry that transmits updates from the write clock domain 141A to the read clock domain 141B, the circuitry that transmits updates from the read clock domain 141B to the write clock domain 141A may be designed in a similar manner. In the illustrated circuit, WP register 202 may be updated based on write requests 152A or updates 211, discussed above. The value stored in the WP register 202 may be sent to the toggle encoder 206. A logic 302 may be used to determine if there is a pending update.

A logic 304 may use information from the feedback synchronizer 208B to determine if the sent update was received by the read clock domain 141B. To that end, logic 304 may determine whether the toggle encoder output 306 is the same as the feedback synchronizer output 308. Identity between outputs 306 and 308 indicates that the previously transmitted toggle update was received and, therefore, the toggle encoder may safely provide a new update by toggling its output 306. As such, logics 302 and 304 may, in combination, verify if the synchronizer blocks are ready for transmission of a new update. The combination of the output of logics 302 and 304 may also implement the update 211 to the WP register 202 discussed above.
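The combined check performed by logics 302 and 304 may be sketched as a function that decides how many pending words can be sent in the current write-clock cycle. The function name `next_toggle_update`, the per-line evaluation of the comparison, and the greedy choice of the highest-weight lines first are assumptions made for illustration; the disclosure does not specify how a partial update is selected.

```python
def next_toggle_update(pending: int, encoder_out: int,
                       feedback_out: int, bits: int):
    """Return (words_sent, new_encoder_out) for one write-clock cycle."""
    if pending == 0:
        # Role of logic 302: nothing is pending, so nothing toggles.
        return 0, encoder_out
    # Role of logic 304, evaluated per line: a line whose level differs from
    # the fed-back level still has an unacknowledged toggle.
    unacknowledged = encoder_out ^ feedback_out
    words_sent = 0
    for bit in reversed(range(bits)):
        weight = 1 << bit
        line_ready = ((unacknowledged >> bit) & 1) == 0
        if line_ready and pending - words_sent >= weight:
            words_sent += weight
    # Toggling exactly the chosen lines encodes `words_sent` in binary.
    return words_sent, encoder_out ^ words_sent


idle = next_toggle_update(0, 0b00, 0b00, bits=2)
all_ready = next_toggle_update(5, 0b00, 0b00, bits=2)
one_line_busy = next_toggle_update(3, 0b11, 0b01, bits=2)
```

In the last call only the low-order line has been acknowledged, so a single word is sent and the remaining words stay pending.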

In the illustration 300 of FIG. 6, synchronizer 208A and feedback synchronizer 208B are implemented as 3 back-to-back registers. Synchronizer 208A is clocked by the read clock domain 141B and feedback synchronizer 208B is clocked by the write clock domain 141A. Such an implementation with 3 registers may guarantee that the output of the toggle may be constant for at least 3 clock edges (e.g., 1.5 periods) of the read clock domain 141B and the output of the feedback may be constant for at least 3 clock edges (e.g., 1.5 periods) of the write clock domain 141A. This implementation might provide robustness in situations where the mismatch between the clocks in the clock domains is substantial. Consider, for example, a 3-bit toggle encoding, having a 1st-order bit line, a 2nd-order bit line, and a 3rd-order bit line. Such encoding may allow up to 7 FIFO transactions per update. The three-register arrangement discussed above guarantees that each update uses, at most, 6 clock edges of the slower clock domain. As a result, the FIFO may allow up to 7 FIFO transactions every 6 clocks of the slower clock domain, which should be sufficient for many applications.
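The round trip through the synchronizer and the feedback synchronizer may be sketched with a simple shift-register model. The class `SyncChain`, the function `ticks_until_ack`, and the simplification that both domains capture on the same tick are assumptions made for illustration; real clocks are asynchronous, so this only shows the idealized register count.

```python
class SyncChain:
    """Back-to-back registers clocked by the destination domain (illustrative)."""

    def __init__(self, stages: int = 3, init: int = 0):
        self.regs = [init] * stages

    @property
    def q(self) -> int:
        return self.regs[-1]

    def tick(self, d: int) -> None:
        # One capturing clock edge: shift the new sample in, the oldest out.
        self.regs = [d] + self.regs[:-1]


def ticks_until_ack(stages: int = 3) -> int:
    """Capturing edges until the fed-back level matches the encoder output."""
    forward = SyncChain(stages)    # stands in for synchronizer 208A
    feedback = SyncChain(stages)   # stands in for feedback synchronizer 208B
    encoder_out = 1                # the line was just toggled from 0 to 1
    ticks = 0
    while feedback.q != encoder_out:
        forward_q_before = forward.q   # registers sample pre-edge values
        forward.tick(encoder_out)
        feedback.tick(forward_q_before)
        ticks += 1
    return ticks


round_trip = ticks_until_ack(stages=3)
```

With three registers in each direction the idealized round trip is 6 capturing edges, which is consistent with the bound stated above under the simplification used here.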

With the foregoing in mind, FIGS. 7 and 8 provide flowcharts 350 and 370, respectively, related to methods performed by the high-performance clock crossing FIFO when transferring data across clock domains, such as the ones described above. Flowchart 350 in FIG. 7 illustrates the transmission of data in the sending clock domain (i.e., FIFO writes) and flowchart 370 in FIG. 8 illustrates the reception of data in the receiving clock domain (i.e., FIFO reads).

In a first block 352 of the flowchart 350 of FIG. 7, the FIFO may receive a request for a FIFO write. In a decision block 354, the FIFO may verify if it has available space. To that end, the FIFO may compare the value stored in memory (e.g., a register) in the write domain with the size of the buffer. If the FIFO does not have space, the FIFO may emit an error and halt the write process. If the FIFO has available space, the FIFO may write the data to its memory in block 356. In block 358, the FIFO may generate an update signal using a toggle encoding, as discussed above. The update signal may be based on the number of data words that were written to the memory. The toggle encoding may be assisted by the use of memory (e.g., a register) that keeps track of pending update signals. In block 360, a toggle decoder may receive the update signal and decode the information for use in the read clock domain. The decoding in block 360 may be accompanied by a feedback signal, which may update the memory that keeps track of pending update signals discussed above. In block 362, memory (e.g., a register) in the read clock domain may be updated based on the decoded update signal. The updated memory may be used by circuitry in the read clock domain to perform safe reads from the FIFO.

The flowchart 370 of FIG. 8 initiates in a block 372, in which the FIFO may receive a request for a FIFO read. In a decision block 374, the FIFO may verify if there are available words. To that end, the FIFO may compare the value stored in memory (e.g., a register) in the read domain with zero. If the FIFO is empty, the FIFO may emit an error and halt the read process. If the FIFO has available words, the FIFO may provide queued data from its memory in block 376. In block 378, the FIFO may generate an update signal using a toggle encoding, as discussed above. The update signal may be based on the number of data words that were read from the memory. The toggle encoding may be assisted by the use of memory (e.g., a register) that keeps track of pending update signals. In block 380, a toggle decoder may receive the update signal and decode the information for use in the write clock domain. The decoding in block 380 may be accompanied by a feedback signal, which may update the memory that keeps track of pending update signals discussed above. In block 382, memory (e.g., a register) in the write clock domain may be updated based on the decoded update signal. The updated memory may be used by circuitry in the write clock domain to perform safe write operations into the FIFO.

FIGS. 9 and 10 illustrate a method to implement resets in the above-discussed FIFO. During initialization, the number of clock cycles for a reset command to propagate along a pipeline, such as the FIFO pipelines, may be large and, as a result, circuitry coupled to the FIFO may exit a reset before the FIFO. Such a situation may lead the circuitry coupled to the FIFO to receive erroneous data (e.g., FIFO status data, read data from the FIFO memory). The method illustrated in diagram 400 of FIG. 9 and flowchart 450 of FIG. 10 provides a safe state during the reset. Following a reset or initialization of the electronic device in block 452, an asynchronous reset signal 402 may be asserted in boundary circuitry 410 during block 454. In block 456, write clock 404 and read clock 406 may be ungated to perform initialization of the read domain internal registers 412 and the write domain internal registers 414. As the boundary circuitry 410 is being reset with the asynchronous reset signal 402, the internal registers 412 and 414 may safely perform the pipelined reset in block 456. At the end of the number of clock cycles used for achieving the proper internal state of the FIFO, the asynchronous reset signal 402 may be deasserted in block 458 to exit the reset process.

In FIFO systems, the boundary circuitry 410 may refer to registers that indicate the available space in the FIFO, such as the above-discussed pointers or registers. As such, if the asynchronous reset signal 402 causes the write pointer and the read pointer registers to coincide or, alternatively, causes the WU register to indicate a full buffer and the RU register to indicate an empty buffer, functional circuitry will not attempt to write data to or read data from the FIFO, preventing unsafe operations. More generally, boundary circuitry 410 may refer to interface circuitry of a functional block that may prevent interactions with the functional block in a reset state. Therefore, it should be understood that the reset operations described in FIGS. 9 and 10 may be extended to other functional blocks and soft IPs with the appropriate adaptations.
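The reset sequence of blocks 454 through 458 may likewise be sketched as a simple model. The sketch assumes the WU/RU variant described above (WU forced to indicate a full buffer and RU forced to indicate an empty buffer while the asynchronous reset is asserted). The class name, the `pipeline_cycles` parameter, and the choice to return WU to zero on deassertion are assumptions made for the example and are not taken from the disclosure.

```python
class ResetSketch:
    """Behavioral sketch of the safe-state reset of blocks 454-458."""

    def __init__(self, depth, pipeline_cycles):
        self.depth = depth
        self.pipeline_cycles = pipeline_cycles  # assumed internal reset latency
        self.async_reset = False
        self.cycles_left = 0
        self.write_used = 0   # WU boundary register
        self.read_used = 0    # RU boundary register

    def assert_reset(self):
        # Block 454: boundary registers take their safe values at once.
        self.async_reset = True
        self.cycles_left = self.pipeline_cycles
        self.write_used = self.depth  # appears full, so writes are blocked
        self.read_used = 0            # appears empty, so reads are blocked

    def clock(self):
        # Block 456: internal registers reset over several clock cycles.
        if self.async_reset and self.cycles_left > 0:
            self.cycles_left -= 1

    def deassert_reset(self):
        # Block 458: leave reset only after the pipelined reset has finished.
        if self.cycles_left != 0:
            raise RuntimeError("internal registers are still resetting")
        self.async_reset = False
        self.write_used = 0           # assumed: the FIFO starts out empty

    def can_write(self):
        return self.write_used < self.depth

    def can_read(self):
        return self.read_used > 0


fifo = ResetSketch(depth=4, pipeline_cycles=3)
fifo.assert_reset()
blocked = (fifo.can_write(), fifo.can_read())  # both False during reset
for _ in range(3):
    fifo.clock()
fifo.deassert_reset()
```

While the reset is asserted, both `can_write` and `can_read` return False in this model, so surrounding circuitry that leaves its own reset early has nothing it is permitted to do with the FIFO.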

The methods and devices of this disclosure may be incorporated into any suitable circuit. For example, the methods and devices may be incorporated into numerous types of devices such as microprocessors or other integrated circuits. Exemplary integrated circuits include programmable array logic (PAL), programmable logic arrays (PLAs), field programmable logic arrays (FPLAs), electrically programmable logic devices (EPLDs), electrically erasable programmable logic devices (EEPLDs), logic cell arrays (LCAs), field programmable gate arrays (FPGAs), application specific standard products (ASSPs), application specific integrated circuits (ASICs), and microprocessors, just to name a few.

Moreover, while the method operations have been described in a specific order, it should be understood that other operations may be performed in between described operations, described operations may be adjusted so that they occur at slightly different times, or described operations may be distributed in a system which allows the occurrence of the processing operations at various intervals associated with the processing, as long as the processing of overlying operations is performed as desired.

While the embodiments set forth in the present disclosure may be susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and have been described in detail herein. However, it may be understood that the disclosure is not intended to be limited to the particular forms disclosed. The disclosure is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the disclosure as defined by the following appended claims. In addition, the techniques presented and claimed herein are referenced and applied to material objects and concrete examples of a practical nature that demonstrably improve the present technical field and, as such, are not abstract, intangible or purely theoretical. Further, if any claims appended to the end of this specification contain one or more elements designated as “means for [perform]ing [a function]...” or “step for [perform]ing [a function]...,” it is intended that such elements are to be interpreted under 35 U.S.C. 112(f). For any claims containing elements designated in any other manner, however, it is intended that such elements are not to be interpreted under 35 U.S.C. 112(f).

What is claimed is:
 1. A dual clock first-in first-out (FIFO) buffer, comprising: a FIFO memory; a read used register in a read clock domain of the FIFO, wherein the read used register stores a number of available words in the FIFO memory; a write pending register in a write clock domain of the FIFO, wherein the write pending register stores a number of pending updates to the read used register; a write-domain encoder in the write clock domain of the FIFO, wherein the write-domain encoder generates an update signal based on the number of pending updates; a clock-crossing synchronizer in the read clock domain of the FIFO that receives the update signal; a read-domain decoder in the read clock domain of the FIFO that receives the update signal from the clock-crossing synchronizer and provides a decoded update signal to the read used register; and a feedback synchronizer in the write clock domain of the FIFO that receives the update signal from the clock-crossing synchronizer in the read clock domain and updates the write pending register.
 2. The dual clock FIFO buffer of claim 1, comprising: a write used register in the write clock domain of the FIFO, wherein the write used register stores a number of words in the FIFO memory; a read pending register in the read clock domain of the FIFO, wherein the read pending register stores a second number of pending updates to the write used register; a read-domain encoder in the read clock domain of the FIFO, wherein the read-domain encoder generates a second update signal based on the second number of pending updates; a second clock-crossing synchronizer in the write clock domain of the FIFO that receives the second update signal; a write-domain decoder in the write clock domain of the FIFO that receives the second update signal from the second clock-crossing synchronizer and provides a second decoded update signal to the write used register; and a second feedback synchronizer in the read clock domain of the FIFO that receives the second update signal from the second clock-crossing synchronizer in the write clock domain and updates the read pending register.
 3. The dual clock FIFO buffer of claim 2, wherein the FIFO provides, to the write clock domain, a signal indicative that the FIFO is full based on the write used register.
 4. The dual clock FIFO buffer of claim 1, wherein the FIFO provides, to the read clock domain, a signal indicative that the FIFO is empty based on the read used register.
 5. The dual clock FIFO buffer of claim 1, wherein the clock-crossing synchronizer comprises 3 registers clocked by a read clock of the read clock domain.
 6. The dual clock FIFO buffer of claim 1, wherein the feedback synchronizer comprises 3 registers clocked by a write clock of the write clock domain.
 7. The dual clock FIFO buffer of claim 1, comprising logic that compares a first output of the write-domain encoder and a second output of the feedback synchronizer to generate a signal that controls the write-domain encoder.
 8. The dual clock FIFO buffer of claim 1, wherein the write-domain encoder comprises a toggle encoder, and wherein the read-domain decoder comprises a toggle decoder.
 9. The dual clock FIFO buffer of claim 8, wherein the toggle encoder comprises a 3-bit toggle code, and wherein the toggle encoder and the toggle decoder are connected through 3 1-bit connections of the clock-crossing synchronizer.
 10. The dual clock FIFO buffer of claim 1, wherein the FIFO comprises pre-fetch circuitry that comprises look-ahead logic coupled to the read used register or the read-domain decoder, or both.
 11. The dual clock FIFO buffer of claim 1, wherein the dual clock FIFO is implemented in an application-specific integrated circuit (ASIC) or in a field programmable gate array (FPGA).
 12. A non-transitory computer readable media comprising instructions to generate a bitstream comprising a soft intellectual property (IP) block for a programmable logic device, wherein the soft IP block comprises: a first-in first-out (FIFO) memory; and a FIFO controller comprising: a read used memory synchronized by a read clock of the programmable logic device, wherein the read used memory stores a number of available words in the FIFO memory; a write pending memory synchronized by a write clock of the programmable logic device, wherein the write pending memory stores a number of pending updates to the read used memory; a write-domain toggle encoder synchronized by the write clock, wherein the write-domain toggle encoder generates a toggle-encoded signal based on the number of pending updates; a clock-crossing synchronizer synchronized by the read clock that receives the toggle-encoded signal; and a read-domain toggle decoder synchronized by the read clock that generates an update signal based on the toggle-encoded signal received from the clock-crossing synchronizer and provides a decoded signal to the read used memory, wherein the write-domain toggle encoder and the read-domain toggle decoder are coupled by a plurality of 1-bit lines via the clock-crossing synchronizer, and wherein the toggle-encoded signal comprises a bit-sliced signal.
 13. The non-transitory computer readable media of claim 12, wherein the FIFO controller comprises a feedback synchronizer synchronized by the write clock that updates the write pending memory based on an output of the clock-crossing synchronizer.
 14. The non-transitory computer readable media of claim 12, wherein the FIFO controller comprises: a write used memory synchronized by the write clock, wherein the write used memory stores a number of words in the FIFO memory; a read pending memory synchronized by the read clock, wherein the read pending memory stores a second number of pending updates to the write used memory; a read-domain toggle encoder synchronized by the read clock, wherein the read-domain toggle encoder generates a second toggle-encoded signal based on the second number of pending updates; a second clock-crossing synchronizer synchronized by the write clock that receives the second toggle-encoded signal; and a write-domain toggle decoder synchronized by the write clock that generates a second update signal based on the second toggle-encoded signal received from the second clock-crossing synchronizer and provides a second decoded update signal to the write used memory, wherein the read-domain toggle encoder and the write-domain toggle decoder are coupled by a plurality of 1-bit lines via the second clock-crossing synchronizer, and wherein the toggle-encoded signal comprises a bit-sliced signal.
 15. The non-transitory computer readable media of claim 14, wherein the FIFO controller comprises a second feedback synchronizer synchronized by the read clock that updates the read pending memory based on an output of the second clock-crossing synchronizer.
 16. The non-transitory computer readable media of claim 12, wherein the instructions to generate the bitstream comprise determining a difference between a read clock frequency and a write clock frequency and determining a number of 1-bit lines in the plurality of 1-bit lines based on the difference.
 17. The non-transitory computer readable media of claim 12, wherein the FIFO controller comprises look-ahead logic configured to perform a FIFO pre-fetch function based on the read used memory or the read-domain toggle decoder, or both.
 18. A system comprising an electronic device that comprises a first clock domain, a second clock domain, and a dual-clock first-in first-out (FIFO) buffer controller, comprising: a read used register in the first clock domain of the FIFO, wherein the read used register stores a number of available words in a FIFO memory; a write pending register in the second clock domain of the FIFO, wherein the write pending register stores a first number of pending updates to the read used register; a write-domain encoder in the second clock domain of the FIFO, wherein the write-domain encoder generates a first update signal based on the first number of pending updates; a first clock-crossing synchronizer in the first clock domain of the FIFO that receives the first update signal; a read-domain decoder in the first clock domain of the FIFO that receives the first update signal from the first clock-crossing synchronizer and provides a first decoded update signal to the read used register; a first feedback synchronizer in the second clock domain of the FIFO that receives the first update signal from the first clock-crossing synchronizer in the first clock domain and updates the write pending register; a write used register in the second clock domain of the FIFO, wherein the write used register stores a number of words in the FIFO memory; a read pending register in the first clock domain of the FIFO, wherein the read pending register stores a second number of pending updates to the write used register; a read-domain encoder in the first clock domain of the FIFO, wherein the read-domain encoder generates a second update signal based on the second number of pending updates; a second clock-crossing synchronizer in the second clock domain of the FIFO that receives the second update signal; a write-domain decoder in the second clock domain of the FIFO that receives the second update signal from the second clock-crossing synchronizer and provides a second decoded update signal to the write used register; and a second feedback synchronizer in the first clock domain of the FIFO that receives the second update signal from the second clock-crossing synchronizer in the second clock domain and updates the read pending register.
 19. The system of claim 18, wherein the first clock-crossing synchronizer comprises a plurality of independent 1-bit lines coupling the write-domain encoder and the read-domain decoder.
 20. The system of claim 18, wherein the electronic device comprises a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC), or both.